
    Selection of robust features for the Cover Source Mismatch problem in 3D steganalysis

    This paper introduces a novel method for extracting sets of features from 3D objects that characterise a robust steganalyzer. Specifically, the proposed steganalyzer should mitigate the Cover Source Mismatch (CSM) problem. A steganalyzer is a classifier that aims to separate cover and stego objects. During the training stage, it takes as input a set of features extracted from cover-stego pairs of 3D objects. During the testing stage, however, the steganalyzer has to identify whether information is hidden in a set of 3D objects that may differ from those used during training. Addressing the CSM problem corresponds to testing the generalization ability of the steganalyzer when distortions are introduced in the cover objects before hiding information through steganography. Our method selects those 3D features that best model the changes introduced in objects by steganography or information hiding and, moreover, generalize to objects not present in the training set. The proposed robust steganalysis approach is tested against changes in 3D objects such as those produced by mesh simplification and additive noise. The results of this study show that steganalyzers trained with the selected set of robust features achieve better detection accuracy of the changes embedded in the objects than those trained with other sets of features.
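
    As a rough illustration of the selection idea, the sketch below ranks features by the cover/stego separation that survives distortion. The Fisher-style score and the min-combination are illustrative assumptions, not the paper's exact criterion.

```python
import numpy as np

def select_robust_features(cover, stego, cover_dist, stego_dist, k=20):
    """Rank features by cover/stego separation that survives distortion.

    cover, stego           : (n_objects, n_features) features from the
                             original cover-stego pairs
    cover_dist, stego_dist : same features after distorting the covers
                             (e.g. mesh simplification, additive noise)
    Returns the indices of the k highest-scoring features.
    """
    def fisher(a, b):
        # Per-feature Fisher criterion: between-class over within-class spread.
        return (a.mean(0) - b.mean(0)) ** 2 / (a.var(0) + b.var(0) + 1e-12)

    d_clean = fisher(cover, stego)            # separation on clean sources
    d_dist = fisher(cover_dist, stego_dist)   # separation after distortion
    # A robust feature separates the classes in both conditions, so score
    # by the weaker of the two: failing either case gives a low score.
    score = np.minimum(d_clean, d_dist)
    return np.argsort(score)[::-1][:k]
```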

    Human Group Activity Recognition based on Modelling Moving Regions Interdependencies

    In this research study, we model the interdependency of actions performed by people in a group in order to identify their activity. Unlike in single-human activity recognition, the local movement activity in interacting groups is usually influenced by the other persons in the group. We propose a model that describes the discriminative characteristics of group activity by considering the relations between motion flows and the locations of moving regions. The inputs of the proposed model are jointly represented in time-space and time-movement spaces. These spaces are modelled using Kernel Density Estimation (KDE), which is then fed into a machine learning classifier. Unlike other group-based human activity recognition algorithms, the proposed methodology is automatic and does not rely on pedestrian detection or on the manual annotation of tracks. Index Terms—Group Activity Identification, Motion Segmentation
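
    As a concrete illustration of the KDE step, the sketch below builds a fixed-length clip descriptor from the time-space and time-movement samples, assuming a tracker already supplies per-frame region positions and motion vectors; the SciPy implementation and grid evaluation are illustrative choices.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_descriptor(points, grid_axes):
    """Fit a KDE on joint samples and evaluate it on a fixed grid,
    giving a fixed-length descriptor regardless of track length.

    points    : (n_samples, d) rows, e.g. (t, x, y) for time-space
                or (t, dx, dy) for time-movement
    grid_axes : d one-dimensional arrays defining the evaluation grid
    """
    kde = gaussian_kde(points.T)                 # scipy expects (d, n)
    mesh = np.meshgrid(*grid_axes, indexing="ij")
    grid = np.vstack([m.ravel() for m in mesh])  # (d, n_grid) query points
    return kde(grid)                             # flattened density values

def clip_descriptor(ts_points, tm_points, axes):
    # One fixed-length vector per video clip, combining the time-space
    # and time-movement densities; this is then fed to any classifier.
    return np.concatenate([kde_descriptor(ts_points, axes),
                           kde_descriptor(tm_points, axes)])
```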

    Group Activity Recognition on Outdoor Scenes

    In this research study, we propose an automatic group activity recognition approach that models the interdependencies of group activity features over time. Unlike in simple human activity recognition approaches, the distinguishing characteristics of group activities are often determined by how the movements of people influence one another. We propose to model the group interdependencies in both motion and location spaces. These spaces are extended to time-space and time-movement spaces and modelled using Kernel Density Estimation (KDE). These representations are then fed into a machine learning classifier which identifies the group activity. Unlike other approaches to group activity recognition, we do not rely on the manual annotation of pedestrian tracks from the video sequence.

    3D Mesh Steganalysis using local shape features

    Steganalysis aims to identify the changes made to a specific medium with the intention of hiding information. In this paper we assess the efficiency of several local feature detectors in finding hidden information. In the proposed 3D steganalysis approach, we first smooth the cover object and its corresponding stego-object, obtained after embedding a given message. We use various operators to extract local features from both the cover and stego-objects and from their smoothed versions. Machine learning algorithms are then used to learn to discriminate between those 3D objects which carry hidden information and those which do not. The proposed 3D steganalysis methodology is shown to outperform other approaches on a well-known database of 3D objects.
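
    A minimal sketch of this pipeline, using trimesh for loading and Laplacian smoothing; the particular residuals and moment statistics below are illustrative, since the paper evaluates several local feature detectors.

```python
import numpy as np
import trimesh
from scipy.stats import skew, kurtosis

def local_shape_features(path):
    """Compare a mesh against its Laplacian-smoothed version and
    summarise the local residuals with moment statistics."""
    mesh = trimesh.load(path, force='mesh')
    smooth = mesh.copy()
    trimesh.smoothing.filter_laplacian(smooth, lamb=0.5, iterations=10)

    # Local residuals: vertex displacement and change in dihedral angles
    # (smoothing moves vertices but keeps the topology, so the arrays align).
    disp = np.linalg.norm(mesh.vertices - smooth.vertices, axis=1)
    dihedral = mesh.face_adjacency_angles - smooth.face_adjacency_angles

    feats = []
    for x in (disp, dihedral):
        feats += [x.mean(), x.var(), skew(x), kurtosis(x)]
    return np.asarray(feats)  # fed to any standard classifier
```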

    Learning Spatio-Temporal Representations with Temporal Squeeze Pooling

    In this paper, we propose a new video representation learning method, named Temporal Squeeze (TS) pooling, which extracts the essential movement information from a long sequence of video frames and maps it into a small set of images, named Squeezed Images. By embedding Temporal Squeeze pooling as a layer into off-the-shelf Convolutional Neural Networks (CNNs), we design a new video classification model, named the Temporal Squeeze Network (TeSNet). The resulting Squeezed Images contain the essential movement information from the video frames, as determined by the optimization of the video classification task. We evaluate our architecture on two video classification benchmarks and compare the results to the state of the art.
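
    One plausible reading of such a layer, sketched in PyTorch below; the learned softmax-normalised temporal mixing is an assumption about the exact TS formulation.

```python
import torch
import torch.nn as nn

class TemporalSqueeze(nn.Module):
    """T input frames are mixed into K << T "squeezed images" through
    learned, softmax-normalised temporal weights."""
    def __init__(self, t_in: int, k_out: int):
        super().__init__()
        self.mix = nn.Parameter(torch.randn(k_out, t_in))  # (K, T)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (B, T, C, H, W) -> squeezed: (B, K, C, H, W)
        w = self.mix.softmax(dim=1)  # weights sum to 1 over the T frames
        return torch.einsum('kt,btchw->bkchw', w, frames)

# The K squeezed images can then be fed to an off-the-shelf 2D CNN,
# e.g. by folding K into the channel dimension.
squeeze = TemporalSqueeze(t_in=64, k_out=2)
clip = torch.randn(8, 64, 3, 112, 112)
print(squeeze(clip).shape)  # torch.Size([8, 2, 3, 112, 112])
```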

    Compressing Cross-Domain Representation via Lifelong Knowledge Distillation

    Most Knowledge Distillation (KD) approaches focus on transferring discriminative information and assume that the data is provided in batches during the training stages. In this paper, we address a more challenging scenario in which different tasks are presented sequentially, at different times, and the learning goal is to transfer the generative factors of visual concepts learned by a Teacher module to a compact latent space represented by a Student module. To achieve this, we develop a new Lifelong Knowledge Distillation (LKD) framework in which we train an infinite mixture model as the Teacher, which automatically increases its capacity to deal with a growing number of tasks. In order to ensure a compact architecture and to avoid forgetting, we propose to measure the relevance of the knowledge from a new task to each of the experts making up the Teacher module, guiding each expert to capture the probabilistic characteristics of several similar domains. The network architecture is expanded only when an entirely different task is learned. The Student is implemented as a lightweight probabilistic generative model. The experiments show that LKD can train a compressed Student module that achieves state-of-the-art results with fewer parameters.
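
    The following sketch shows what a data-free distillation step of this kind could look like; the expert and Student interfaces (expert.sample, student.elbo) are assumed, not taken from the paper.

```python
import torch

def distill_step(teacher_experts, student, optimizer, n=64):
    """The Student never sees real data, only samples drawn from each
    Teacher expert, so the generative factors of all learned tasks are
    compressed into one small model.

    Assumed interfaces:
      expert.sample(n) -> batch of generated images
      student.elbo(x)  -> evidence lower bound on x
    """
    student.train()
    for expert in teacher_experts:
        with torch.no_grad():
            x = expert.sample(n)        # replayed knowledge of one task
        loss = -student.elbo(x).mean()  # maximum likelihood on the replay
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```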

    Co-attention enabled content-based image retrieval

    Content-based image retrieval (CBIR) aims to provide the most similar images to a given query. Feature extraction plays an essential role in the retrieval performance of a CBIR pipeline. Current CBIR studies either uniformly extract feature information from the input image and use it directly, or employ a trainable spatial weighting module whose output is used for similarity comparison between pairs of query and candidate matching images. These spatial weighting modules are normally not query-sensitive and rely only on the knowledge learned during the training stage. They may focus on incorrect regions, especially when the target image is not salient or is surrounded by distractors. This paper proposes an efficient query-sensitive co-attention mechanism ("co-attention" here refers to spatial attention conditioned on the query content) for large-scale CBIR tasks. To reduce the extra computation cost that query sensitivity adds to the co-attention mechanism, the proposed method clusters the selected local features. Experimental results indicate that the co-attention maps provide the best retrieval results on benchmark datasets under challenging situations, such as completely different image acquisition conditions between the query and its matching image.
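
    A minimal sketch of a query-sensitive co-attention map of this kind; the k-means clustering of query descriptors follows the abstract, while the max-similarity attention rule is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def co_attention_descriptor(query_locals, cand_fmap, k=8):
    """Build an attention-weighted global vector for a candidate image,
    conditioned on the query content.

    query_locals : (n, d) local descriptors selected from the query
    cand_fmap    : (h, w, d) convolutional feature map of a candidate
    """
    # Clustering keeps the attention computation cheap: the candidate is
    # compared against k centers instead of all n query descriptors.
    centers = KMeans(n_clusters=k, n_init=10).fit(query_locals).cluster_centers_

    h, w, d = cand_fmap.shape
    f = cand_fmap.reshape(-1, d)
    f = f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-12)
    c = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + 1e-12)

    attn = (f @ c.T).max(axis=1)       # best match over the query clusters
    attn = np.clip(attn, 0, None)
    attn /= attn.sum() + 1e-12
    return (f * attn[:, None]).sum(axis=0)  # attention-weighted pooling
```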

    Dynamic Scalable Self-Attention Ensemble for Task-Free Continual Learning

    Continual learning represents a challenging task for modern deep neural networks due to the catastrophic forgetting that follows the adaptation of network parameters to new tasks. In this paper, we address a more challenging learning paradigm called Task-Free Continual Learning (TFCL), in which the task information is missing during training. To deal with this problem, we introduce the Dynamic Scalable Self-Attention Ensemble (DSSAE) model, which dynamically adds new Vision Transformer (ViT)-based experts to deal with the data distribution shift during training. To avoid frequent expansions and to ensure an appropriate number of experts for the model, we propose a new dynamic expansion mechanism that evaluates the novelty of incoming samples as an expansion signal. Furthermore, the proposed expansion mechanism does not require knowing the task information or the class labels, so it can be used in a realistic learning environment. Empirical results demonstrate that the proposed DSSAE achieves state-of-the-art performance in a series of TFCL experiments.
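
    A minimal sketch of a novelty-driven expansion check of this kind; the cosine-distance novelty measure and the fixed threshold are illustrative assumptions.

```python
import torch

def expansion_signal(batch_feats, expert_centroids, threshold=0.5):
    """Decide whether the incoming batch warrants a new expert.

    The distance from the incoming batch to the closest existing expert
    serves as the expansion signal; no task identity or class label is
    consulted, matching the task-free setting.

    batch_feats      : (n, d) embeddings of the incoming samples
    expert_centroids : (e, d) one representative embedding per expert
    """
    b = torch.nn.functional.normalize(batch_feats.mean(0, keepdim=True), dim=1)
    c = torch.nn.functional.normalize(expert_centroids, dim=1)
    novelty = 1.0 - (b @ c.T).max()    # distance to the closest expert
    return novelty.item() > threshold  # True -> add a new ViT expert
```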

    InfoVAEGAN: Learning Joint Interpretable Representations by Information Maximization and Maximum Likelihood

    Learning disentangled and interpretable representations is an important step towards accomplishing comprehensive data representations on the manifold. In this paper, we propose a novel representation learning algorithm which combines the inference abilities of Variational Autoencoders (VAE) with the generalization capability of Generative Adversarial Networks (GAN). The proposed model, called InfoVAEGAN, consists of three networks: an Encoder, a Generator and a Discriminator. InfoVAEGAN aims to jointly learn discrete and continuous interpretable representations in an unsupervised manner by applying two different data-free log-likelihood functions to the variables sampled from the generator's distribution. We propose a two-stage algorithm for optimizing the inference network separately from the generator training. Moreover, we enforce the learning of interpretable representations by maximizing the mutual information between the existing latent variables and those created through the generative and inference processes.
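
    The mutual-information term can be sketched in the InfoGAN style, with an inference head predicting the interpretable codes back from generated samples; the network interfaces below are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn.functional as F

def mutual_info_loss(generator, q_head, batch=64, z_dim=100, n_cat=10):
    """Lower-bound the mutual information between latent codes and
    generated samples by reconstructing the codes from the samples.

    Assumed interfaces:
      generator(z, c_cat, c_cont) -> generated images
      q_head(x) -> (category logits, predicted continuous codes)
    """
    z = torch.randn(batch, z_dim)               # nuisance noise
    c_cat = torch.randint(0, n_cat, (batch,))   # discrete code
    c_cont = torch.rand(batch, 2) * 2 - 1       # continuous codes in [-1, 1]

    x = generator(z, c_cat, c_cont)
    logits, cont_pred = q_head(x)
    # Discrete code: cross-entropy; continuous code: Gaussian (MSE) term.
    return F.cross_entropy(logits, c_cat) + F.mse_loss(cont_pred, c_cont)
```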

    Expressive Local Feature Match for Image Search

    Content-based image retrieval (CBIR) aims to search, from a large pool of images, for those most similar to a given query. Existing state-of-the-art works extract a compact global feature vector for each image and then evaluate their similarity. Although some CBIR works utilize local features to obtain better retrieval results, they either require extra codebook training or use re-ranking to improve the retrieved results. In this work, we propose a many-to-many local feature matching approach for large-scale CBIR tasks. Unlike existing local feature based algorithms, which tend to extract large amounts of low-dimensional local features from each image, the characteristic feature representation in the proposed approach is modeled for each image with the aim of employing fewer but more expressive local features. Characteristic latent features are selected using k-means clustering and then fed into a similarity measure, without using complex matching kernels or codebook references. Despite the straightforwardness of the proposed CBIR method, experimental results indicate state-of-the-art results on several benchmark datasets.
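
    A minimal sketch of the two stages, with k-means producing the characteristic features as the abstract describes; the symmetric Chamfer-style matching rule and k=16 are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def characteristic_features(local_desc, k=16):
    """Compress many local descriptors (n, d) into k expressive,
    L2-normalised characteristic features via k-means."""
    km = KMeans(n_clusters=k, n_init=10).fit(local_desc)
    c = km.cluster_centers_
    return c / (np.linalg.norm(c, axis=1, keepdims=True) + 1e-12)

def many_to_many_similarity(a, b):
    """Score two images by matching every characteristic feature of one
    to its best counterpart in the other, symmetrically, with no
    codebook or matching kernel involved."""
    sims = a @ b.T  # pairwise cosine similarities between features
    return 0.5 * (sims.max(axis=1).mean() + sims.max(axis=0).mean())
```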